A gene encoding aldehyde dehydrogenase and use thereof

The present invention relates to a novel DNA which encodes aldehyde dehydrogenase
(SNDH) derived from Gluconobacter oxydans DSM 4025, an expression vector containing
the DNA and recombinant organisms containing the expression vector. Furthermore, the
present invention concerns a process for producing recombinant aldehyde dehydrogenase
protein and a process for producing L-ascorbic acid (vitamin C) and/or 2-keto-L-gulonic
acid (2-KGA) from L-sorbosone by using the recombinant aldehyde dehydrogenase protein
or recombinant organisms containing the expression vector.

Vitamin C is one of the indispensable nutrient factors for human beings and has been
commercially synthesized by the Reichstein process for about 60 years. Synthetic vitamin C is
also used in animal feeds even though farm animals can synthesize it in their own bodies.
Although the Reichstein process has many advantages for industrial vitamin C
production, it still has drawbacks such as high energy consumption and the use of
considerable quantities of organic and inorganic solvents. Therefore, over the past
decades, many approaches to manufacturing vitamin C using enzymatic conversions, which
would be more economical as well as ecological, have been investigated.

The present invention provides a gene coding for an aldehyde dehydrogenase (SNDH), e.g.
from G. oxydans DSM 4025 (FERM BP-3812) as disclosed in US 6,242,233, catalyzing the
conversion of L-sorbosone not only to 2-keto-L-gulonic acid (2-KGA), but also to vitamin
C.

The present invention provides a novel DNA which encodes aldehyde dehydrogenase
(SNDH) derived from G. oxydans DSM 4025. The present invention also provides an
expression vector containing the DNA and recombinant organisms containing the
expression vector. Furthermore, the present invention provides a process for producing
recombinant aldehyde dehydrogenase protein and a process for producing L-ascorbic acid
(vitamin C) and/or 2-keto-L-gulonic acid (2-KGA) from L-sorbosone by using the
recombinant aldehyde dehydrogenase protein or recombinant organisms containing the
expression vector.

Hei/fin, 27.09.2002

This invention is directed to a nucleic acid molecule comprising a nucleotide sequence
which encodes a protein having the amino acid sequence at positions 32-578 of SEQ ID
NO:2, or a protein derived from that protein by substitution, deletion, insertion or
addition of one or more amino acids in the amino acid sequence at positions 32-578 of SEQ ID
NO:2, which has the SNDH activity. This invention is also directed to an expression
vector comprising such a polynucleotide, especially one which functions in a host cell
belonging to bacterial cells, yeast cells and plant cells. Preferably the host cell belongs to
the genera Gluconobacter, Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia.
This invention is also directed to a recombinant organism comprising such an expression
vector, especially one which has the polynucleotide on its chromosomal DNA, and
preferably one which is a microorganism belonging to the genera Gluconobacter,
Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia.

This invention is also directed to a nucleic acid molecule as described above consisting of a
polynucleotide comprising the nucleotide sequence at positions 351-2084 of SEQ ID NO:1,
or a polynucleotide comprising the nucleotide sequence at positions 258-2084 of SEQ ID NO:1, or a
polynucleotide capable of hybridizing to the above polynucleotides, and which encodes a
protein having SNDH activity.

This invention includes a nucleic acid molecule which comprises the nucleotide sequence 
at positions 351-2084 of SEQ ID NO:1. This invention is also directed to an expression
vector comprising such a polynucleotide, especially one which functions in a host cell 
belonging to bacterial cells, yeast cells and plant cells. Preferably the host cell belongs to 
the genera Gluconobacter, Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia. 
This invention is also directed to a recombinant organism comprising such an expression
vector, especially one which has the polynucleotide on its chromosomal DNA, and 
preferably one which is a microorganism belonging to the genera Gluconobacter, 
Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia. 

This invention is also directed to a process for producing 2-KGA more efficiently by using 
a disruptant of the SNDH gene of G. oxydans DSM 4025 (FERM BP-3812).

As used herein, "cloning vector" means a plasmid or phage DNA or other DNA sequence
which is able to replicate autonomously in a host cell, and which is characterized by one or
a small number of restriction endonuclease recognition sites at which such DNA sequences
may be cut in a determinable fashion without loss of an essential biological function of the
vector, and into which a DNA fragment may be spliced in order to bring about its
replication and cloning. The cloning vector may further contain a marker suitable for use in the
identification of cells transformed with the cloning vector. Markers, e.g., provide
tetracycline resistance or ampicillin resistance.

As used herein, "expression" refers to the process by which a polypeptide is produced from
a structural gene. The process involves transcription of the gene into mRNA and the
translation of such mRNA into polypeptide(s).

As used herein, "expression vector" means a vector similar to a cloning vector but which is
capable of enhancing the expression of a gene that has been cloned into it, after
transformation into a host. The cloned gene is usually placed under the control of (i.e., operably
linked to) certain control sequences such as promoter sequences. Promoter sequences
may be either constitutive or inducible.

As used herein, "gene" refers to a DNA sequence that contains information needed for ex- 
pressing a polypeptide or protein. 

As used herein, "nucleic acid molecule" includes both DNA and RNA and, unless
otherwise specified, includes both double-stranded and single-stranded nucleic acid, and
nucleosides thereof. Also included are hybrids such as DNA-RNA hybrids, DNA-RNA-protein
hybrids, RNA-protein hybrids, and DNA-protein hybrids.

As used herein, "host" includes any prokaryotic or eukaryotic cell that is the recipient of a
replicable expression vector or cloning vector. A "host," as the term is used herein, also
includes prokaryotic or eukaryotic cells that can be genetically engineered by well known
techniques to contain desired gene(s) on their chromosome or genome. Examples of such
hosts are known to the skilled artisan.

As used herein, "mutation" refers to a single base pair change, insertion or deletion in the
nucleotide sequence of interest.

As used herein, "mutagenesis" refers to a process whereby a mutation is generated in DNA.
With "random" mutagenesis, the exact site of mutation is not predictable, occurring
anywhere in the chromosome of the microorganism, and the mutation is brought about as a
result of physical damage caused by agents such as radiation or chemical treatment.

As used herein, "operon" refers to a unit of bacterial gene expression and regulation,
including the structural genes and regulatory elements in DNA.

As used herein, "phenotype" refers to observable physical characteristics dependent upon 
the genetic constitution of a microorganism.

As used herein, "promoter" means a DNA sequence generally described as the 5' region of a
gene, located proximal to the start codon. The transcription of an adjacent gene(s) is
initiated at the promoter region. If a promoter is an inducible promoter, then the rate of
transcription increases in response to an inducing agent. In contrast, the rate of
transcription is not regulated by an inducing agent if the promoter is a constitutive promoter.

As used herein, "recombinant" means a recombinant host which may be any prokaryotic
or eukaryotic cell and contains the desired cloned gene(s) on an expression vector or
cloning vector. This term also includes those prokaryotic or eukaryotic cells that have been
genetically engineered to contain the desired gene(s) in the chromosome or genome of
that organism.

As used herein, "recombinant vector" includes any cloning vector or expression vector
which contains the desired cloned gene(s).

As used herein, "SNDH" stands for L-sorbosone dehydrogenase and "RHA" stands for
RNA helicase A.

As used herein, "percent identical" refers to the percent of the amino acids of the subject 
amino acid sequence that have been matched to identical amino acids in the compared
amino acid sequence by a sequence analysis program as exemplified below. 
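
The definition of "percent identical" can be illustrated with a minimal sketch. It is not the sequence analysis program referred to in the text; it assumes the two sequences have already been aligned to equal length, and the function name and the use of "-" as a gap character are hypothetical conventions chosen for illustration.

```python
def percent_identity(subject: str, compared: str) -> float:
    """Percent of subject residues matched to identical residues in `compared`.

    Assumption: both strings are pre-aligned to the same length and '-'
    marks a gap. Gap positions in the subject are not counted as residues.
    """
    if len(subject) != len(compared):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the alignment columns where the subject has a residue.
    columns = [(s, c) for s, c in zip(subject, compared) if s != "-"]
    if not columns:
        raise ValueError("subject sequence has no residues")
    matches = sum(1 for s, c in columns if s == c)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Five of six subject residues are identical to the compared sequence.
    print(round(percent_identity("MKTAYI", "MKSAYI"), 2))
```

Counting relative to the subject sequence follows the wording of the definition, which speaks of "the percent of the amino acids of the subject amino acid sequence"; a real alignment program may normalize differently.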

The invention provides an isolated nucleic acid molecule encoding the enzyme (SNDH).
Methods and techniques designed for the manipulation of isolated nucleic acid molecules
are well known in the art. Methods for the isolation, purification, and cloning of nucleic
acid molecules, as well as methods and techniques describing the use of eukaryotic and
prokaryotic host cells and nucleic acid and protein expression therein, are known to the
skilled person.

Functional derivatives are defined on the basis of the amino acid sequences of the present
invention by addition, insertion, deletion and/or substitution of one or more amino acid
residues of such sequences wherein such derivatives still have the SNDH activity measured
by an assay known in the art or specifically described herein. Such functional derivatives
can be made either by chemical peptide synthesis known in the art or by recombinant
techniques on the basis of the DNA sequences as disclosed herein by methods known in
the state of the art. Amino acid exchanges in proteins and peptides which do not generally
alter the activity of such molecules are known in the state of the art.

In particular embodiments of the present invention, conservative substitutions of interest
occur as follows: As example substitutions, Ala to Val/Leu/Ile, Arg to Lys/Gln/Asn, Asn to
Gln/His/Lys/Arg, Asp to Glu, Cys to Ser, Gln to Asn, Glu to Asp, Gly to Pro/Ala, His to
Asn/Gln/Lys/Arg, Ile to Leu/Val/Met/Ala/Phe/norLeu, Lys to Arg/Gln/Asn, Met to
Leu/Phe/Ile, Phe to Leu/Val/Ile/Ala/Tyr, Pro to Ala, Ser to Thr, Thr to Ser, Trp to Tyr/Phe,
Tyr to Trp/Phe/Thr/Ser, and Val to Ile/Leu/Met/Phe/Ala/norLeu are reasonable. As
preferred examples, Ala to Val, Arg to Lys, Asn to Gln, Asp to Glu, Cys to Ser, Gln to Asn, Glu
to Asp, Gly to Ala, His to Arg, Ile to Leu, Leu to Ile, Lys to Arg, Met to Leu, Phe to Leu, Pro
to Ala, Ser to Thr, Thr to Ser, Trp to Tyr, Tyr to Phe, and Val to Leu are reasonable. If
such substitutions result in a change in biological activity, then more substantial changes,
denominated exemplary substitutions described above, are introduced and the products
screened.
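
The preferred substitutions listed above can be written as a lookup table, which may be convenient when checking a proposed variant against the list. This is only an illustrative sketch: the table uses standard three-letter residue codes, and the names `PREFERRED_SUBSTITUTIONS` and `is_preferred_substitution` are hypothetical.

```python
# Preferred conservative substitutions as listed in the text
# (original residue -> preferred replacement), in three-letter codes.
PREFERRED_SUBSTITUTIONS = {
    "Ala": "Val", "Arg": "Lys", "Asn": "Gln", "Asp": "Glu", "Cys": "Ser",
    "Gln": "Asn", "Glu": "Asp", "Gly": "Ala", "His": "Arg", "Ile": "Leu",
    "Leu": "Ile", "Lys": "Arg", "Met": "Leu", "Phe": "Leu", "Pro": "Ala",
    "Ser": "Thr", "Thr": "Ser", "Trp": "Tyr", "Tyr": "Phe", "Val": "Leu",
}


def is_preferred_substitution(original: str, replacement: str) -> bool:
    """Return True if `original` -> `replacement` is on the preferred list."""
    return PREFERRED_SUBSTITUTIONS.get(original) == replacement


if __name__ == "__main__":
    print(is_preferred_substitution("Ala", "Val"))  # on the preferred list
    print(is_preferred_substitution("Ala", "Leu"))  # exemplary only
```

The broader exemplary list (e.g. Ala to Val/Leu/Ile) would need a mapping from each residue to a set of replacements; it is omitted here to keep the sketch short.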

Unless otherwise mentioned, all amino acid sequences determined by sequencing the
purified SNDH protein herein were determined using an automated amino acid sequencer
(such as model 470A, Perkin-Elmer Applied Biosystems).

Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA
molecule herein were determined using an automated DNA sequencer (such as the model
ALFexpress II, Amersham Pharmacia Biotech), and all amino acid sequences of
polypeptides encoded by DNA molecules determined herein were predicted by translation of the
DNA sequence determined as above. Therefore, as is known in the art for any DNA
sequence determined by this automated approach, any nucleotide sequence determined
herein may contain some errors. Nucleotide sequences determined by automation are
typically at least about 90% identical, more typically at least about 95% to at least about
99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The
actual sequence can be more precisely determined by other approaches including manual
DNA sequencing methods well known in the art. As is also known in the art, a single
insertion or deletion in a determined nucleotide sequence compared to the actual sequence
will cause a frame shift in translation of the nucleotide sequence such that the predicted
amino acid sequence encoded by a determined nucleotide sequence will be completely
different from the amino acid sequence actually encoded by the sequenced DNA molecule,
beginning at the point of such an insertion or deletion.
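
The effect of a single inserted base on the predicted amino acid sequence can be shown with a small sketch. The 15-nucleotide sequence below is an invented toy example, not part of SEQ ID NO:1, and the translation uses the standard genetic code as an assumption.

```python
# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def first_difference(a: str, b: str) -> int:
    """Index of the first differing residue, or -1 if one is a prefix of the other."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1


if __name__ == "__main__":
    actual = "ATGGCTAAAGGTCTG"               # hypothetical "true" sequence
    determined = actual[:5] + "C" + actual[5:]  # one spurious inserted base
    print(translate(actual))       # protein predicted from the true sequence
    print(translate(determined))   # protein predicted after the insertion
    print(first_difference(translate(actual), translate(determined)))
```

In this toy case the second codon still encodes the same residue because the inserted base falls in its third position, so the predicted sequences diverge from the following codon onward.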

Furthermore the present invention is directed to polynucleotides encoding a polypeptide
having the SNDH activity as disclosed in the sequence listing as SEQ ID NO:2 as well as the
complementary strands, or those which include these sequences, DNA sequences or
fragments thereof, and DNA sequences, which hybridize under standard conditions with
such sequences but which encode for polypeptides having exactly the same amino acid
sequence.

Standard conditions for hybridization mean in this context the conditions which are
generally used by a person skilled in the art to detect specific hybridization signals, or
preferably so called "stringent hybridization conditions" used by a person skilled in the art.
Thus, as used herein, the term "stringent hybridization conditions" means hybridization
will occur if there is at least 95%, and preferably at least 97%, identity between the sequences.
Stringent hybridization conditions are, e.g., overnight incubation at
42°C using a digoxigenin (DIG)-labeled DNA probe (constructed by using a DIG labeling
system; Roche Diagnostics GmbH, 68298 Mannheim, Germany) in a solution comprising
50% formamide, 5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 0.2% sodium dodecyl
sulfate, 0.1% N-lauroylsarcosine, and 2% blocking reagent (Roche Diagnostics GmbH),
followed by washing the filters in 0.1 x SSC at about 60°C.

Briefly, the SNDH gene, the DNA molecule containing said gene, the recombinant
expression vector and the recombinant organism used in the present invention can be obtained
by the following steps:

(1) Isolating chromosomal DNA from G. oxydans DSM 4025 and constructing the gene 
library with the chromosomal DNA in an appropriate host cell, e.g. E. coli.

(2) Cloning the SNDH gene from a chromosomal DNA by colony-, plaque-, or Southern- 
hybridization, PCR (polymerase chain reaction) cloning, western-blot analysis and so on. 

(3) Determining the nucleotide sequence of the SNDH gene obtained as above by conven- 
tional methods to select DNA molecule containing said SNDH gene and constructing the 
recombinant expression vector on which SNDH gene can express efficiently. 

(4) Constructing recombinant organisms carrying SNDH gene by an appropriate method 
for introducing DNA into host cell, e.g. transformation, transduction, conjugal transfer 
and/or electroporation, which host cell thereby becomes a recombinant organism of this 
invention. 

The materials and the techniques used in the above aspect of the present invention are 
exemplified in detail as follows: 

A total chromosomal DNA can be purified by a procedure well known in the art. The
desired gene can be cloned in either plasmid or phage vectors from a total chromosomal 
DNA typically by either of the following illustrative methods: 

(i) The partial amino acid sequences are determined from the purified proteins or peptide
fragments thereof. Such whole protein or peptide fragments can be prepared by the
isolation of such a whole protein or by peptidase treatment from the gel after
SDS-polyacrylamide gel electrophoresis. The protein or fragments thus obtained are applied to a
protein sequencer such as the Applied Biosystems automatic gas-phase sequencer 470A. The
amino acid sequences can be utilized to design and prepare oligonucleotide probes and/or
primers with a DNA synthesizer such as the Applied Biosystems automatic DNA synthesizer
381A. The probes can be used for isolating clones carrying the target gene from a gene
library of the strain carrying the target gene by means of Southern-, colony- or
plaque-hybridization.

(ii) Alternatively, for the purpose of selecting clones expressing target protein from the
gene library, immunological methods with antibody prepared against the target protein
can be applied.

(iii) The DNA fragment of the target gene can be amplified from the total chromosomal
DNA by the PCR method with a set of primers, i.e., two oligonucleotides synthesized according
to the amino acid sequences determined as above. Then a clone carrying the whole target
gene can be isolated from the gene library constructed, e.g. in E. coli, by Southern-, colony-,
or plaque-hybridization with the PCR product obtained above as the probe.

DNA sequences which can be made by PCR by using primers designed on the basis of the
DNA sequences disclosed herein by methods known in the art are also an object of the
present invention.

The above-mentioned antibody can be prepared with the purified SNDH proteins, the purified
recombinant SNDH proteins such as His-tagged SNDH expressed in E. coli, or its peptide
fragment as an antigen.

Once a clone carrying the desired gene is obtained, the nucleotide sequence of the target
gene can be determined by a well known method such as the dideoxy chain termination
method with M13 phage.

The gene encoding the L-sorbosone dehydrogenase activity of the present invention is 
illustrated in Figure 1, showing a restriction map of the SNDH and ORF-A genes wherein 
ORF means open reading frame and Signal seq. means the putative signal peptide sequence 
of the SNDH gene, and Figure 2, showing a physical map of the insert DNA fragments of 
cosmid pVSN5, and pUC plasmids pUCSNP4, pUCSNP9, pUCSN19, and pUCSN5. In
the physical map of pVSN5, the arrow filled in gray shows the SNDH gene. 

This specific gene encodes the SNDH enzyme having 578 amino acid residues together
with a putative signal peptide of 31 amino acid residues (SEQ ID NO:2). In terms of
nucleotide sequences, the coding region of the SNDH gene is at positions 258-2087 of SEQ
ID NO:1 and includes the signal peptide (258-350) and the stop codon (2085-2087). Thus
the sequence without the stop codon is at positions 258-2084 of SEQ ID NO:1, and
additionally without the signal sequence is at positions 351-2084 of SEQ ID NO:1.
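
The stated positions can be cross-checked with simple arithmetic. The sketch below only uses the coordinates given above for SEQ ID NO:1 (1-based, inclusive); the constant and function names are hypothetical and chosen for illustration.

```python
# Coordinates quoted in the text for SEQ ID NO:1 (1-based, inclusive).
CODING_REGION = (258, 2087)   # signal peptide + mature enzyme + stop codon
SIGNAL_PEPTIDE = (258, 350)
STOP_CODON = (2085, 2087)
MATURE_ENZYME = (351, 2084)


def span_length(region: tuple) -> int:
    """Number of nucleotides in an inclusive 1-based interval."""
    start, end = region
    return end - start + 1


def codon_count(region: tuple) -> int:
    """Number of whole codons in a region; the length must be a multiple of 3."""
    length = span_length(region)
    if length % 3 != 0:
        raise ValueError(f"region {region} is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    print(codon_count(SIGNAL_PEPTIDE))  # residues in the putative signal peptide
    print(codon_count(MATURE_ENZYME))   # residues in the mature SNDH enzyme
    print(codon_count(CODING_REGION))   # all codons, including the stop codon
```

Under these coordinates the signal peptide spans 93 nucleotides (31 codons) and the mature enzyme 1734 nucleotides (578 codons), which agrees with the residue counts stated in the paragraph.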

To express the desired gene/nucleotide sequence isolated from G. oxydans DSM 4025
efficiently, various promoters can be used, e.g., the original promoter of the gene,
promoters of antibiotic resistance genes such as the kanamycin resistance gene of Tn5 and the
ampicillin resistance gene of pBR322, the promoter of beta-galactosidase of E. coli (lac),
the trp-, tac- and trc-promoters, promoters of lambda phage and any promoters which can
be functional in a host cell. For this purpose, the host cell can be selected from the group
consisting of bacterial cells, plant cells, and yeast cells. Preferably the host cell belongs
to the genera Gluconobacter, Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or
Escherichia. Among the preferred host cells, most preferably it belongs to G. oxydans, e.g.,
G. oxydans DSM 4025 (FERM BP-3812), which was deposited as DSM 4025 on Mar. 17, 1987
under the conditions of the Budapest Treaty at the Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH, Braunschweig, Germany.

For expression, other regulatory elements, such as a Shine-Dalgarno (SD) sequence (e.g.,
AGGAGG and so on, including natural and synthetic sequences operable in the host cell)
and a transcriptional terminator (inverted repeat structure including any natural and
synthetic sequence operable in the host cell) which are operable in the host cell (into which
the coding sequence will be introduced to provide a recombinant cell of this invention)
can be used with the above described promoters.

For the expression of polypeptides which locate in the periplasmic space, like the SNDH
protein of the present invention, a signal peptide, which usually contains 15 to 50 amino acid
residues and is totally hydrophobic, is preferably associated. A DNA encoding a signal
peptide can be selected from any natural and synthetic sequence operable in the desired
host cell.

A wide variety of host/cloning vector combinations may be employed in cloning the
double-stranded DNA. Preferred vectors for the expression of the gene of the present
invention in E. coli are selected from any vectors usually used in E. coli, such as pQE vectors
which can express His-tagged recombinant proteins (QIAGEN K.K., Tokyo, Japan),
pBR322 or its derivatives including pUC18 and pBluescript II (Stratagene Cloning
Systems, Calif., USA), pACYC177 and pACYC184 and their derivatives, and a vector
derived from a broad-host-range plasmid such as RK2 and RSF1010. A preferred vector for
the expression of the nucleotide sequence of the present invention in bacteria including
Gluconobacter, Acetobacter, Pseudomonas, Acinetobacter or Klebsiella is selected from any
vectors which can replicate in Gluconobacter, Acetobacter, Pseudomonas, Acinetobacter,
and/or Klebsiella as well as in a preferred cloning organism such as E. coli. The preferred
vector is a broad-host-range vector such as a cosmid vector like pVK100 and its derivatives
and RSF1010. Copy number and stability of the vector should be carefully considered for
stable and efficient expression of the cloned gene and also for efficient cultivation of the
host cell carrying the cloned gene. Nucleic acid molecules containing transposable
elements such as Tn5 can also be used as a vector to introduce the desired gene into the
preferred host, especially on a chromosome. Nucleic acid molecules containing any DNAs
isolated from the preferred host together with the gene of the present invention are also
useful to introduce this gene into the preferred host cell, especially on a chromosome.
Such nucleic acid molecules can be transferred to the preferred host by applying any
conventional method, e.g., transformation, transduction, conjugal mating or
electroporation, which are well known in the art, considering the nature of the host cell
and the nucleic acid molecule.

The SNDH gene/nucleotide sequences provided in this invention are ligated into a suitable
vector containing a regulatory region such as a promoter, a ribosomal binding site, and a
transcriptional terminator operable in the host cell described above with a well-known
method in the art to produce an expression vector.

To construct a recombinant microorganism carrying a recombinant expression vector,
various gene transfer methods including transformation, transduction, conjugal mating,
and electroporation can be used. The method for constructing a recombinant cell may be
selected from the methods well-known in the field of molecular biology. Conventional
transformation systems can be used for Gluconobacter, Acetobacter, Pseudomonas,
Acinetobacter, Klebsiella or Escherichia. A transduction system can also be used for E. coli.
A conjugal mating system can be widely used in Gram-positive and Gram-negative bacteria
including E. coli, P. putida, and Gluconobacter. An example of conjugal mating is disclosed
in WO 89/06,688. The conjugation can occur in liquid medium or on a solid surface.
Examples of a recipient for SNDH production include microorganisms of Gluconobacter,
Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia. To the recipient for
conjugal mating, a selective marker may be added; e.g., resistance against nalidixic acid or
rifampicin is usually selected. Natural resistance can also be used; e.g., resistance against
polymyxin B is useful for many Gluconobacters.

The present invention provides recombinant SNDH. One can increase the production
yield of the SNDH enzyme by introducing the SNDH gene provided by the present
invention into a host cell including G. oxydans DSM 4025. One can also produce the SNDH
proteins more efficiently in a host cell selected from the group consisting of Gluconobacter,
Acetobacter, Pseudomonas, Acinetobacter, Klebsiella or Escherichia by using the SNDH gene
of the present invention. The microorganism may be cultured in an aqueous medium
supplemented with appropriate nutrients under aerobic conditions. The cultivation may
be conducted at a pH of 4.0 to 9.0, preferably 6.0 to 8.0. The cultivation period varies
depending on the pH, temperature and nutrient medium to be used, and is preferably about
1 to 5 days. The preferred temperature range for carrying out the cultivation is from about
13°C to about 36°C, preferably from about WC to about 33°C. It is usually required that
the culture medium contains such nutrients as assimilable carbon sources, e.g., glycerol,
D-mannitol, D-sorbitol, erythritol, ribitol, xylitol, arabitol, inositol, dulcitol, D-ribose,
D-fructose, D-glucose, and sucrose, preferably D-sorbitol, D-mannitol, and glycerol; and
digestible nitrogen sources such as organic substances, e.g., peptone, yeast extract, baker's
yeast, urea, amino acids, and corn steep liquor. Various inorganic substances may also be
used as nitrogen sources; e.g., nitrates and ammonium salts. Furthermore, the culture
medium usually contains inorganic salts, e.g., magnesium sulfate, potassium phosphate,
and calcium carbonate.

An embodiment for the isolation and purification of the recombinant SNDH from the
microorganism after the cultivation is briefly described hereinafter: Cells are harvested
from the liquid culture broth by centrifugation or filtration. The harvested cells are washed
with water, physiological saline or a buffer solution having an appropriate pH. The washed
cells are suspended in the buffer solution and disrupted by means of a homogenizer,
sonicator or French press, or by treatment with lysozyme and so on to give a solution of
disrupted cells. The recombinant SNDH is isolated and purified from the cell-free extract or
disrupted cells, preferably from the cytosol fraction of the microorganism.
The recombinant SNDH can be immobilized on a solid carrier for solid phase enzyme
reaction. The present invention also provides recombinant cells. The recombinant cell
can produce vitamin C and/or 2-KGA from L-sorbosone with the recombinant organisms.

In one embodiment, the invention provides a process for the disruption of the gene by
classical mutagenesis by agents such as UV-irradiation or chemical treatment by any
mutation reagents, e.g., N-methyl-N'-nitro-N-nitrosoguanidine (NTG), ICR170, acridine
orange, and so on, in vivo as well as in vitro.

In another embodiment, the invention provides a process for the disruption of the gene by 
DNA recombination techniques such as transposon insertion, site directed mutagenesis by
PCR, and so on, in vivo as well as in vitro.

In another embodiment, the invention provides a process for producing 2-KGA using the
disruptants described above by fermentation from an appropriate substrate, e.g.,
L-sorbosone, L-sorbose, and D-sorbitol, in appropriate equipment such as jar fermentors,
flasks, and tubes. The invention also provides a process for producing 2-KGA using a
cell-free extract of the disruptants described above by incubation with an appropriate substrate,
e.g., L-sorbosone, L-sorbose, and D-sorbitol, in appropriate equipment such as a
bioreactor.

Example 1: Amino acid sequencing from the N-terminus of SNDH

The N-terminal partial amino acid sequence of the 75 kDa subunit of the SNDH protein was
determined (SEQ ID NO:3). About 10 µg of the SDS-treated purified SNDH (consisting of
75 kDa subunits) was subjected to SDS-PAGE, and the protein band was electroblotted
onto a PVDF membrane. The protein blotted on the membrane was soaked in a digestion
buffer (100 mM potassium phosphate buffer, 5 mM dithiothreitol, and 10 mM EDTA, pH
8.0) and incubated with 5.04 of pyroglutamate aminopeptidase (SIGMA, USA) at 30°C
for 24 hours. After incubation, the membrane was washed with deionized water and
subjected to N-terminal amino acid sequencing using an automated amino acid sequencer
(ABI model 490, Perkin Elmer Corp., Conn., USA). As a result, 14 residues of the
N-terminal amino acid sequence were obtained as illustrated in SEQ ID NO:3.

Example 2: Cloning of partial SNDH gene by PCR 

Amplification of a partial SNDH gene fragment was carried out by PCR with chromosomal
DNA of G. oxydans DSM 4025 (FERM BP-3812) and degenerate oligonucleotide DNA
primers, P11 (SEQ ID NO:6) and P12 (SEQ ID NO:7). Both of the primers were
degenerate DNA mixtures having bias for Gluconobacter codon usage. The PCR was performed
with thermostable Taq polymerase (TAKARA Ex Taq™, Takara Shuzo Co., Ltd., Seta 3-4-1,
Otsu, Shiga, 520-2193, Japan), using a thermal cycler (Gene Amp PCR System 2400-R,
PE Biosystems, 850 Lincoln Centre Drive, Foster City, CA 94404, USA). The reaction
mixture (25 µl) consisted of 200 µM of dNTPs, 50 pmol of each primer (24 - 48
degeneracy), 5 ng of the chromosomal DNA, and 1.25 units of the DNA polymerase in the buffer
provided by the supplier. The reaction was carried out with 5 cycles of 1) denaturation
step at 94°C for 30 sec; 2) annealing step at 37°C for 30 sec; 3) synthesis step at 70°C for 1
min, plus 25 cycles of 1) denaturation step at 94°C for 30 sec; 2) annealing step at 50°C for
30 sec; 3) synthesis step at 70°C for 1 min. As a result, a 41 bp DNA fragment was
specifically amplified and cloned into vector pCR 2.1-TOPO (Invitrogen, 1600 Faraday Avenue,
Carlsbad, California 92008, USA) to obtain a recombinant plasmid pMTSN2. The nucleotide
sequence of the cloned 41 bp DNA, which encodes the N-terminal partial amino acid sequence
of the mature SNDH protein, was confirmed by the dideoxy chain termination method
(F. Sanger et al., Proc. Natl. Acad. Sci. USA, 74, 5463-5467, 1977).
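
The two-stage cycling program and the primer amount described in this example can be summarized in a short sketch. It only encodes the temperatures, times and amounts stated above; the class and function names are hypothetical, and ramp times between steps, which the text does not give, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple


# Program as described: 5 low-stringency cycles, then 25 cycles at 50 °C annealing.
PROGRAM = (
    Stage(5, (Step("denaturation", 94, 30),
              Step("annealing", 37, 30),
              Step("synthesis", 70, 60))),
    Stage(25, (Step("denaturation", 94, 30),
               Step("annealing", 50, 30),
               Step("synthesis", 70, 60))),
)


def total_cycles(program) -> int:
    """Total number of PCR cycles over all stages."""
    return sum(stage.cycles for stage in program)


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times (ramp times between steps not included)."""
    return sum(stage.cycles * sum(step.seconds for step in stage.steps)
               for stage in program)


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Primer concentration in µM; 1 pmol per µl equals 1 µM."""
    return pmol / volume_ul


if __name__ == "__main__":
    print(total_cycles(PROGRAM))
    print(total_hold_seconds(PROGRAM) / 60, "min of programmed hold time")
    print(primer_concentration_um(50, 25), "µM of each primer")
```

The low annealing temperature in the first five cycles is consistent with the use of degenerate primers, where early cycles tolerate mismatches before the annealing temperature is raised.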

Example 3: Complete cloning of the SNDH gene 

(1) Construction of a gene library of G. oxydans DSM 4025

The chromosomal DNA of G. oxydans DSM 4025 was prepared from the cells grown on M
agar medium; 5% D-mannitol, 1.75% corn steep liquor, 5% baker's yeast, 0.25%
MgSO4·7H2O, 0.5% CaCO3 (Pr.G.), 0.5% urea, and 2.0% agar (pH 7.0), for 4 days at 27°C.
The chromosomal DNA (4 µg) was partially digested with 4 units of EcoR I in 20 µl of
reaction mixture. A portion (8 µl) of the sample containing partially-digested DNA
fragments was separated by electrophoresis using a 1% agarose gel. Fragments ranging from
15 to 35 kb were cut out and chemically melted to recover the fragments using QIAEX II
(QIAGEN Inc., 28159 Avenue Stanford, Valencia, CA 91355, USA). The objective DNA
fragments recovered were suspended in H2O. On the other hand, 2 µg of a cosmid vector
pVK100 was completely digested with EcoR I and dephosphorylated at the 5'-ends by
treatment with bacterial alkaline phosphatase (E. coli C75) (Takara Shuzo). The treated
pVK100 (220 ng) was ligated with the 15 - 35 kb EcoR I fragments (1 µg) using a ligation kit
(Takara Shuzo) in 36 µl of reaction mixture. The ligated DNA, which had been ethanol
precipitated and dissolved in an appropriate volume of TE buffer (10 mM Tris-HCl (pH 8.0),
1 mM EDTA), was used for in vitro packaging (Gigapack III Gold Packaging Extract,
Stratagene, 11011 North Torrey Pines Road, La Jolla, CA 92037, USA) to infect E. coli
VCS257, a host strain for the genomic library. As a result, a total of 400,000 - 670,000 clones
containing about 25 kb inserted DNA fragments were obtained.

(2) Complete cloning of the SNDH gene by colony hybridization

A probe was constructed for screening the cosmid library described above, in order to
isolate clones carrying the complete SNDH gene by colony hybridization. The 41 bp DNA
fragment encoding the N-terminal amino acid sequence of SNDH was amplified and labeled
by the PCR-DIG labeling method (Roche Molecular Systems Inc., 1145 Atlantic Avenue,
Alameda, CA 94501, USA). PCR with plasmid pMTSN2 DNA as a template and
oligonucleotide DNA primers P13 (SEQ ID NO:8) and P14 (SEQ ID NO:9) was performed
with thermostable Taq polymerase (TaKaRa Ex Taq™, Takara Shuzo Co., Ltd.), using a
thermal cycler (GeneAmp PCR System 2400-R, PE Biosystems).
The reaction was carried out with 25 cycles of 1) denaturation step at 94°C for 30 sec; 2)
annealing step at 55°C for 30 sec; 3) synthesis step at 70°C for 1 min. Using the DIG-labeled
probe, the cosmid library (about 1,000 clones) was screened by colony
hybridization and chemiluminescent detection according to the method provided by the
supplier (Roche Molecular Systems Inc., USA). Consequently, three
positive clones were isolated and one of them was designated pVSN5, which carried an
insert DNA of about 25 kb in the pVK100 vector. The DNA fragments of 3.2 kb EcoR I, 7.2 kb
EcoR I, and 8.0 kb Pst I, which contain the upstream region, the downstream region, and the
intact SNDH gene, respectively, were subcloned into the pUC18 vector to obtain pUCSN19,
pUCSN5 and pUCSNP4, respectively (Figure 2).
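The 25-cycle labeling program above can be written down as data, which makes it easy to check the programmed hold time. The following is only a minimal sketch: the class and function names are hypothetical, and ramping time between temperatures, which depends on the instrument, is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold within a PCR cycle."""
    name: str
    temp_c: int
    seconds: int


# Probe-labeling program as stated in the text: 25 cycles of three holds.
PROBE_PCR_STEPS = (
    Step("denaturation", 94, 30),
    Step("annealing", 55, 30),
    Step("synthesis", 70, 60),
)
PROBE_PCR_CYCLES = 25


def total_hold_seconds(steps, cycles):
    """Sum of all programmed hold times; instrument ramp time is not included."""
    return cycles * sum(step.seconds for step in steps)


hold_seconds = total_hold_seconds(PROBE_PCR_STEPS, PROBE_PCR_CYCLES)
print(f"programmed hold time: {hold_seconds} s ({hold_seconds / 60:.0f} min)")
```

Under these assumptions the holds alone add up to 3,000 s, i.e. 50 minutes, before any ramping or final hold is counted.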

(3) Nucleotide sequencing of the SNDH gene 

Plasmids pUCSN19, pUCSN5, and pUCSNP4 were used for nucleotide sequencing of the
region including the SNDH gene. The determined nucleotide sequence (SEQ ID NO:1; 3,408
bp) revealed that the ORF of the SNDH gene (1,827 bp, nucleotide positions 258-2084 of SEQ
ID NO:1) encodes a polypeptide of 609 amino acid residues (SEQ ID NO:2). An additional
ORF, ORF-A, was found downstream of the SNDH ORF as illustrated in Figure 1.
ORF-A (1,101 bp, nucleotide positions 2214-3314 of SEQ ID NO:1) encodes
a polypeptide of 367 amino acids.
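The ORF lengths and polypeptide sizes quoted above follow directly from the stated nucleotide positions. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the coordinates are taken as 1-based and inclusive, with the stop codon excluded from each range.

```python
def orf_length(first, last):
    """Length in bp of a region given 1-based, inclusive coordinates."""
    return last - first + 1


def codon_count(first, last):
    """Number of complete codons in the region; rejects partial codons."""
    length = orf_length(first, last)
    if length % 3 != 0:
        raise ValueError(f"{length} bp is not a whole number of codons")
    return length // 3


# Coordinates as stated in the text for SEQ ID NO:1.
SNDH_ORF = (258, 2084)
ORF_A = (2214, 3314)

for name, (first, last) in (("SNDH", SNDH_ORF), ("ORF-A", ORF_A)):
    print(name, orf_length(first, last), "bp,", codon_count(first, last), "codons")
```

Both ranges are whole multiples of three, which agrees with the 609 and 367 residue polypeptides given in the text.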

In the ORF of the SNDH gene, a signal peptide-like sequence (SEQ ID NO:4; 31 amino
acids) is possibly included in the deduced amino acid sequence; it contains (i) many
hydrophobic residues, (ii) positively charged residues close to the N-terminus, and (iii) an
Ala-Xaa-Ala motif as a signal sequence cleavage site. The putative ribosome-binding site
(Shine-Dalgarno, SD, sequence) for the SNDH gene was located 6 bp upstream of the
initiation codon (AGGAGA at nucleotide positions 247-252 of SEQ ID NO:1).
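The positions of the SD sequence and the initiation codon can be located by a plain string search. The sketch below makes two assumptions: it uses only nucleotides 241-270 of SEQ ID NO:1, transcribed from the sequence listing as printed, and its helper name is hypothetical. It also derives the first nucleotide of the mature coding region from the 31-residue signal peptide.

```python
# Nucleotides 241-270 of SEQ ID NO:1, transcribed from the sequence listing
# (assumption: this excerpt is free of transcription errors).
SNIPPET_START = 241
SNIPPET = "GACGCAAGGAGACGTAGATGTTACCCAAAT"


def position_of(motif, seq=SNIPPET, start=SNIPPET_START):
    """Return 1-based (first, last) coordinates of the first match of motif."""
    index = seq.find(motif)
    if index < 0:
        raise ValueError(f"{motif} not found")
    first = start + index
    return first, first + len(motif) - 1


sd_first, sd_last = position_of("AGGAGA")
atg_first, atg_last = position_of("ATG")

# The signal peptide is 31 residues, so the mature protein starts 31 codons in.
SIGNAL_PEPTIDE_RESIDUES = 31
mature_first_nt = atg_first + 3 * SIGNAL_PEPTIDE_RESIDUES

print("SD sequence:", sd_first, "-", sd_last)
print("initiation codon:", atg_first, "-", atg_last)
print("first nucleotide of mature coding region:", mature_first_nt)
```

With this excerpt the search places AGGAGA at 247-252 and the initiation codon at 258-260, and the mature coding region then begins at nucleotide 351.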

A homology search for the SNDH gene was performed with the FASTA program in
GCG (Genetics Computer Group, Madison, WI, USA). The N-terminal 80% region of the
deduced amino acid sequence of the SNDH gene has moderate similarity (41% identity) with
A. calcoaceticus soluble quinoprotein glucose dehydrogenase (GDH-B).
Further analysis by multiple alignment with the ClustalW program
revealed that two highly conserved regions around the presumed active site of
A. calcoaceticus GDH-B were shared by the SNDH protein and several hypothetical proteins
defined by genome DNA analysis. The optional parameters for the analysis were as follows:
MATRIX: blosum, GAPOPEN: 10.0, GAPEXT: 0.05, GAPDIST: 8; all other
parameters were set at the program defaults. The result showed that either of the amino acid
sequences at positions 224-231 and positions 259-265 of SEQ ID NO:5 (the amino
acid sequences at positions 224-231 and/or 259-265 of SEQ ID NO:5 correspond to the
consensus sequences around the active site) was over 85% identical to the corresponding
regions in the deduced amino acid sequences of AB013367 (Bacillus halodurans unknown
protein), AE003996 (Xylella fastidiosa hypothetical protein), AE007222 (Sinorhizobium
meliloti plasmid pSymA hypothetical protein), AE009889 (Pyrobaculum aerophilum
hypothetical protein), AE004541 (Pseudomonas aeruginosa hypothetical protein), ECAE186
(E. coli K-12 hypothetical protein), AF472590 (Sinorhizobium meliloti hypothetical
protein), P73001 (Synechocystis sp. hypothetical protein), and two real proteins,
A. calcoaceticus GDH-B and the SNDH of this invention. Additionally, it was found that
several conserved residues in the presumed active site of A. calcoaceticus GDH-B reported
by Oubrie et al. [J. Mol. Biol. 289:319-333 (1999)], which were Arg227, Asn228, Gln230,
Gly246, and Asp251 of SEQ ID NO:5, were completely conserved among those sequences.
On the other hand, the remaining C-terminal 20% region of the SNDH has moderate similarity
with heme c containing proteins such as a c-type cytochrome, cytochrome f, from the
cyanobacterium P. boryanum and the cd1-type nitrite reductase, NirS, from a Paracoccus
denitrificans strain. In the similar region (about 32% identity), a motif (Cys-Xaa-Xaa-Cys-
His) defined as heme c binding was found at positions 530-534 of SEQ ID NO:2. From the
results shown above, the SNDH protein is thought, on the basis of this genetic analysis, to be
a quinohemoprotein.
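The residue positions named above can be checked against the sequence listing with a few lines of code. This is only an illustrative sketch under stated assumptions: the two excerpts are transcribed by hand from SEQ ID NO:5 (residues 224-265) and SEQ ID NO:2 (residues 526-540) into one-letter code, the helper names are hypothetical, and the identity function compares ungapped windows of equal length rather than reproducing a ClustalW alignment.

```python
import re

# Residues 224-265 of SEQ ID NO:5, one-letter code (hand transcription, assumed correct).
ACTIVE_START = 224
ACTIVE_SEGMENT = "YGHRNPQGITFGPDGTIYATEHGPDTDDELNIIAGGGNYGWP"

# Residues 526-540 of SEQ ID NO:2, one-letter code (hand transcription, assumed correct).
HEME_START = 526
HEME_SEGMENT = "YGSACAACHGAAGQG"


def window(first, last, seq=ACTIVE_SEGMENT, start=ACTIVE_START):
    """Return residues first..last (1-based, inclusive) from the excerpt."""
    return seq[first - start:last - start + 1]


def percent_identity(a, b):
    """Percentage of identical positions between two ungapped windows."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


region_1 = window(224, 231)   # first consensus region
region_2 = window(259, 265)   # second consensus region
active_site = {pos: window(pos, pos) for pos in (227, 228, 230, 246, 251)}

# Heme c binding motif Cys-Xaa-Xaa-Cys-His.
match = re.search(r"C..CH", HEME_SEGMENT)
heme_first = HEME_START + match.start()
heme_last = HEME_START + match.end() - 1

print(region_1, region_2, active_site, (heme_first, heme_last))
```

For windows this short the 85% threshold is strict: a single mismatch in the 8-residue region gives 87.5% identity and in the 7-residue region about 85.7%, so in an ungapped comparison at most one substitution per region stays above the threshold.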

Example 4: Expression of the SNDH gene in E. coli

Plasmids pUCSNP4 and pUCSNP9 (Figure 3), which carry the 8.0 kb Pst I fragment containing
the intact SNDH gene, were transformed into E. coli JM109 to confirm the expression
and the activity of the SNDH proteins.

The activity converting L-sorbosone to vitamin C was tested using the cytosol fraction of the
recombinant E. coli (Table 1). The cytosol fraction was prepared by ultracentrifugation
(100,000 x g, 45 min) of the cell-free extract in 50 mM potassium phosphate buffer
(pH 7.0). The reaction mixture (100 µl) consisted of 125 µg of the cytosol fraction of the
recombinant E. coli, 50 mM L-sorbosone, 1.0 mM phenazine methosulfate (PMS), and,
depending on the case, 1.0 µM PQQ and 1.0 mM CaCl2 as cofactors. The
enzyme reaction was carried out at 30°C for 30 minutes. The native holo-SNDHs from cells
cultivated in LB medium containing 10 µM PQQ and 1.0 mM CaCl2 clearly produced
vitamin C under the defined reaction conditions without added PQQ
and CaCl2. On addition of the cofactors, the apo-enzymes from pUCSNP4 and pUCSNP9
showed almost the same activity as the native holo-enzymes.



Table 1:

Microorganism             PQQ and CaCl2     Specific activity (mU/mg protein)
                          in the medium     + PQQ and CaCl2    - PQQ and CaCl2

E. coli JM109/pUCSNP4          +                 0.157              0.224
E. coli JM109/pUCSNP9          +                 0.198              0.252
E. coli JM109/pUC18            +                 0.000              0.000
E. coli JM109/pUCSNP4          -                 0.155              0.000
E. coli JM109/pUCSNP9          -                 0.176              0.000
E. coli JM109/pUC18            -                 0.000              0.000
G. oxydans DSM 4025                              0.026              0.026



of vitamin C in the defined reaction. 

Example 5: Construction and cultivation of SNDH-gene disruptants of G. oxydans
strains

Figure 4 shows the scheme for the construction of the SNDH gene targeting vector and
GOMTR1SN::Km (SNDH-disruptant). First, plasmid pSUPSN was constructed by ligating
the 8.0 kb Pst I fragment containing the SNDH gene from plasmid pUCSNP4 with the
suicide vector pSUP202. Second, a kanamycin-resistance gene cassette (Km cassette) was
inserted into the EcoR I site of the SNDH gene cloned in plasmid pSUPSN to obtain
plasmid pSUPSN::Km (Kmr Tcr). Then, plasmid pSUPSN::Km was introduced into GOMTR1,
a rifampicin (Rif)-resistant mutant derived spontaneously from the wild-type G. oxydans DSM
4025 strain, to obtain SNDH-null mutants (Kmr Rifr Tcs).

G. oxydans GOMTR1 was cultivated in a 200 ml flask containing 50 ml of T broth, which
was composed of 30 g/l of Trypticase Soy Broth (BBL; Becton Dickinson and Company,
Cockeysville, MD 21030, USA) and 3 g/l of yeast extract (Difco; Becton Dickinson
Microbiology Systems, Becton Dickinson and Company, Sparks, MD 21152, USA) with 100
µg/ml of rifampicin, at 30°C overnight. E. coli HB101 (pRK2013) [D. H. Figurski, Proc.
Natl. Acad. Sci. USA, 76, 1648-1652, 1979] and E. coli JM109 (pSUPSN::Km) were
cultivated in test tubes containing 2 ml of LB medium with 50 µg/ml of kanamycin at 30°C
overnight. Cultured cells of GOMTR1, E. coli HB101 (pRK2013), and E. coli JM109
(pSUPSN::Km) were collected separately by centrifugation and suspended in LB medium
at an OD of about 20, 2, and 2, respectively. These cell suspensions were then mixed in
equal volumes and the mixture was spread on a 0.45 µm nitrocellulose membrane
(PROTRAN, Schleicher & Schuell GmbH, Postfach 4, D-37582 Dassel, Germany) placed on
an agar medium composed of 5.0% mannitol, 0.25% MgSO4·7H2O, 1.75%
corn steep liquor, 5.0% baker's yeast, 0.5% urea, 0.5% CaCO3, and 2.0% agar, to carry out
conjugal transfer of the suicide plasmid from the E. coli donor to GOMTR1. After cultivation at
27°C for a day, the cells containing transconjugants were suspended and diluted
appropriately with T broth, and spread on screening agar plates containing 100 µg/ml of
rifampicin and 50 µg/ml of kanamycin. Finally, several target transconjugants
(Kmr Rifr Tcs), in which the SNDH gene was disrupted with the Km cassette, were obtained.

GOMTR1 and the disruptants, GOMTR1SN::Km, were grown on an agar plate containing
8.0% L-sorbose, 0.25% MgSO4·7H2O, 1.75% corn steep liquor, 5.0% baker's yeast, 0.5%
urea, 0.5% CaCO3, and 2.0% agar at 27°C for 4 days. One loopful of the cells was
inoculated into 50 ml of a seed culture medium (pH 6.0) containing 4% D-sorbitol, 0.4% yeast
extract, 0.05% glycerol, 0.25% MgSO4·7H2O, 1.75% corn steep liquor, 0.1% urea, and
1.5% CaCO3 in a 500 ml Erlenmeyer flask, and cultivated at 30°C and 180 rpm for one day
on a rotary shaker. The seed culture thus prepared was used for inoculating 50 ml of a
main culture medium, which was composed of 12.0% L-sorbose, 2.0% urea, 0.05% glycerol,
0.25% MgSO4·7H2O, 3.0% corn steep liquor, 0.4% yeast extract, and 1.5% CaCO3, in a 500
ml Erlenmeyer flask. The cultivation was carried out at 30°C and 180 rpm for 4 days. As
shown in Table 2, the SNDH-gene disruptants had a molar conversion yield, based on the
substrate consumed, about 3% higher than that of the parent strain.

Table 2

Strain            2KGA      Residual L-sorbose    *Molar yield
                  (g/L)     (g/L)                 (mol %)

GOMTR1SN::Km      96.7      15.3                  99.2
GOMTR1            98.8       9.8                  95.5

Example 6: Introduction of the plasmids carrying the SNDH gene into the SNDH-gene
disruptant of G. oxydans DSM 4025

Several kinds of SNDH-expression plasmids using the broad host range vector pVK100 were
constructed as shown in Figure 5. These plasmids have different insert DNAs at the Hind
III site of pVK100, as follows: pVSN117 has the insert DNA containing the incomplete
SNDH gene encoding a polypeptide up to Gly535 of SEQ ID NO:5, i.e. a C-terminally
deleted SNDH gene, which expresses only the 55 kDa protein; pVSN106 and pVSN114
have the insert DNA containing the intact SNDH gene. These plasmids were introduced
into strain GOMTR1SN::Km, the SNDH-gene disruptant derived from G. oxydans DSM 4025,



The transconjugants carrying the plasmids shown in Figure 5 were grown on an agar plate
containing 10.0% L-sorbose, 0.25% MgSO4·7H2O, 1.75% corn steep liquor, 5.0% baker's
yeast, 0.5% urea, 0.5% CaCO3, and 2.0% agar at 27°C for 4 days. The enzyme reaction
mixture consisted of 80 µg of cell-free extract of the recombinant Gluconobacter strains, 25
mM potassium phosphate buffer (pH 7.0), 50 mM L-sorbosone, and 0.05 mM PMS.
The enzyme reaction was carried out at 30°C for 30 min with shaking at 1,000 rpm. The
result is shown in Table 3.




Table 3

Host cell: GOMTR1SN::Km-2
Vector DNA: pVK100, pVSN117, pVSN106, pVSN114
Vitamin C produced (mg/L): 473.2, 845.3, 860.2







1. An isolated nucleic acid molecule encoding aldehyde dehydrogenase which comprises a
polynucleotide sequence at least 95% identical to the nucleotide sequence of SEQ ID NO:1.

2. An isolated nucleic acid molecule encoding aldehyde dehydrogenase which comprises a
polynucleotide sequence at least 95% identical to the polynucleotide selected from the
group consisting of (a) nucleotides 258-2084 of SEQ ID NO:1, (b) nucleotides 351-2084 of
SEQ ID NO:1, (c) nucleotides 258-1955 of SEQ ID NO:1, and (d) nucleotides 351-1955 of
SEQ ID NO:1.

3. An isolated nucleic acid molecule encoding aldehyde dehydrogenase which comprises a
polynucleotide selected from the group consisting of (a) a polynucleotide encoding the
polypeptide consisting of amino acids of SEQ ID NO:2, (b) a polynucleotide encoding the
polypeptide consisting of amino acids 32-609 of SEQ ID NO:2, (c) a polynucleotide
encoding the polypeptide consisting of amino acids 1-566 of SEQ ID NO:2, and (d) a
polynucleotide encoding the polypeptide consisting of amino acids 32-566 of SEQ ID NO:2.



4. An isolated nucleic acid molecule encoding a polypeptide having aldehyde
dehydrogenase activity, wherein the complement of said nucleic acid molecule hybridizes under
standard conditions with the nucleic acid molecule of any one of claims 1 to 3.

5. An expression vector which comprises the nucleic acid molecule of any one of claims 1
to 4.

6. The expression vector of claim 5, wherein said vector is derived from pQE-plasmids,
pUC-plasmids, pBluescript II, pACYC177, pACYC184, and their derivative plasmids, and
a broad host range plasmid such as pVK100 and RSF1010.

7. A recombinant microorganism which is transformed with the expression vector of claim 
5 or 6. 

8. The recombinant microorganism of claim 7, wherein said microorganism comprises the
nucleic acid molecule of any one of claims 1 to 4 on its chromosomal DNA.

9. The recombinant microorganism of claim 7 or 8, wherein said microorganism is
selected from the group consisting of bacterial cells, yeast cells, and plant cells.

10. The recombinant microorganism of claim 9, wherein said microorganism is a member
of the genus selected from the group consisting of Gluconobacter, Acetobacter,
Pseudomonas, Klebsiella, Acinetobacter, and Escherichia.

11. A process for the production of vitamin C and/or 2-KGA from L-sorbosone
comprising (a) propagating the recombinant microorganism of claim 7 in an appropriate
culture media, and (b) recovering and separating vitamin C and/or 2-KGA from said
culture media.

12. A process for the production of vitamin C and/or 2-KGA from L-sorbosone
comprising (a) propagating a recombinant organism in an appropriate culture media,
wherein the nucleic acid molecule of any one of claims 1 to 4 is heterologously introduced
to said recombinant organism, and (b) recovering and separating vitamin C and/or 2-KGA
from said culture media.

13. A process for the production of vitamin C and/or 2-KGA from L-sorbosone
comprising (a) propagating a recombinant organism in an appropriate culture media,
wherein a nucleic acid molecule comprising a polynucleotide encoding a polypeptide
whose consensus amino acid sequences around the active site are at least 85% identical to
those of the polypeptide of SEQ ID NO:5 is heterologously introduced to said recombinant
organism, and (b) recovering and separating vitamin C and/or 2-KGA from said culture
media.

14. A process for the production of 2-KGA via L-sorbosone from an appropriate sugar
compound comprising (a) propagating a microorganism belonging to Gluconobacter
oxydans DSM 4025 in an appropriate culture media, wherein a gene encoding aldehyde
dehydrogenase encoded by any one of claims 1 to 4 is disrupted in said microorganism,
and (b) recovering and separating 2-KGA from said culture media.

15. A process according to claim 14, wherein the sugar compound is selected from the
group consisting of L-sorbosone, D-glucose, D-sorbitol, and L-sorbose.

16. A process for the production of aldehyde dehydrogenase encoded by any one of claims
1 to 4, the process comprising (a) propagating the recombinant microorganism of claim 7
in an appropriate culture media, and (b) recovering and separating said aldehyde
dehydrogenase from said culture media.

17. A process for the production of aldehyde dehydrogenase encoded by any one of claims
1 to 4, the process comprising (a) propagating a recombinant organism in an appropriate
culture media, wherein the nucleic acid molecule of any one of claims 1 to 4 is
heterologously introduced to said recombinant organism, and (b) recovering and separating said
aldehyde dehydrogenase from said culture media.

18. A process for the production of aldehyde dehydrogenase encoded by any one of claims
1 to 4, the process comprising (a) propagating a recombinant organism in an appropriate
culture media, wherein a nucleic acid molecule comprising a polynucleotide sequence
encoding a polypeptide whose consensus amino acid sequences around the active site are at
least 85% identical to those of the polypeptide of SEQ ID NO:5 is heterologously
introduced to said recombinant organism, and (b) recovering and separating said aldehyde
dehydrogenase from said culture media.

***
SEQUENCE LISTING

<110> Roche Vitamins AG
<120> A gene encoding aldehyde dehydrogenase and use thereof
<130> WDR5237
<140>
<141>
<160> 9
<170> PatentIn Ver. 2.1

<210> 1
<211> 3408
<212> DNA
<213> Gluconobacter oxydans
<220>
<221> CDS
<222> (258)..(2087)
<220>
<221> CDS
<222> (2214)..(3317)

<400> 1

GCGACTGGCA GCAGCGCAAC TATGACCACT ATGGCCTGCC GCCCTATTGG 50 

ATCTAACTCA TCCAGTAAGC CACCATCAGC CGGCCCCTGC GGGGGCCGGC 100 

TTTTTGCGCT AGACCCCGCC GAGGTGCTGT CGlAACCTAA GGTCACATCT ISO 

25 TTACTTCCAC ATCCGCCCTT GTCAGTTCTG ACGTGACAAA TOGTCGCGGT 200 

CATGCTGCTG AATGCGGATG CCAGTCCCAG ATCCAAGCCC GACGCAAGGA 250 

GACGTAGATG TTACCCAAAT CATTGAAACA TAAGAATGGC GCCATGCGCC 300 

TTCTCGCAGC CTCGACCCTT GCGCTGATGA TCGGCGCGGG TGCCCATGCG 350 

CAGGTAAACC CGGTCGAAGT GCCG&TGGGC GCGAACGAGA CCTTTACCTC 400 

30 GCGCGTGCTG ACCACCGGCC TGTCGAACCC TTGGGAAATC ACCTGGGGCC • 450 

CCGACAATAT GCTGTGGGTG ACCGAGCGAT CTTCCGGCGA AGTGACGCGC 500 

GTCGACCCCA ATACCGGCGA GCAGCAGGTC CTGCTGACCC TGACCGATTT 55 0 

CAGCG^CGAT GTGCAACACC AGGGCCTACT TGGCCTCGCG CtGCATCCTG 600 

AGTTTATGCA AGAGAGCGGC AACGACTACG TCTATATCG* CTACACTTAT 650 

35 AACACCGGCA CCGAAGAAGC GCCCGXTCCG CATCAAAAGC MGTGCGTTA 700 

TGCCTATGAC GCTGCCGCGC AGCAGCTGGT CGATCCGGTO GATCTGGTCG 750 

CAGGCATTCC CGCAGGCAAC GACCACAATC GCGGTCGC2& CAAATTCGCC 800 

CCCGATGGCC AACACATCTT TTACACGCTG GGCGAGCAAG GCGCGAACTT 850 

TGGCGGTAAC TTCCGCCGTC CGAACCACGC GCAACTGCTG CCGACGCAAG 900 

40 AGCAGGTCGA CGCGGGCGAT TGGGTCGCCT ATTCGGGCAA GATCCTGCGC 950 

GTGAACCTTG ACGdCACGAT CCCCGAAGAC AACCCCGAGA TCGAGGGCGT 1000 

GCGTAGCCAT ATCTTTACCT ATGGCCACCG TAACCCGCAG GGCATCACCT 105 0 

TTGGCCCCGA CGGCACCATT TATGCCACCG AAGACGGCCC CGATACGGAT 1100 

GACGAGCTGA ACATCATCGC CGGCGGTGGC AACTATGGGT GGCCGAATGT 1150
GGCCGGCTA.T CGCGATGGCA AATCCTATGT CTACGCTGAT TGGAGCCAAG 1200
CGCCCQCTOA CCAGCGTTAC ACCGGTCGCG CCGGTATCCC CGACACCGTG 1250 
CCGCAATTCC CCGAGCTGGA ATTCGCGCCC GAGATGGTCG ATCCGCTGAC 1300 
AACCTATTGG ACGGTGGATA ATGATTACGA TTTCACCGCC AATTGCGGCT 1350 
S GGATCTGTAA TCCGACGATC GCGCCTTCGT CTGCCTATFA CTATGCGGCG 1400 
GGCGAGAGCG GTATCGCGGC TTGGSATAAX TCSATCCTGA TCCCGACGCT 14S0 
GAAACATG GC GGCATCTATG TGCAGCACCI CAGCGATGAT GGCCAATCTG 1500 
TCGACGGCCT GCCCGAGCTG TGGTTCAGCA CCCAGAACCG CTATCGCGAT 15S0 
ATCGAGATCA GCCCCGATAA CCATGTTTTT GTGGCGACCG ACAACTTTGG 1600 
10 CACCTCGGCS CAGAAATATG GCGAGACCGG CTTTACCAAC GTGCTGCATA 1650 
ACCCCGGCGC GATCCTTGTC TTTAG CTATG TCGGCGAGGA TGCTGCSGGT 1700 
CAGACCGGAA TGATGACCGC GCCCGCACCQ CAGACGCAAT ACACGCAAGT 17S0 
- <3CCCGCeGAG GGTGCAGGCG CGG6CGCGAC 'TGAGGTTGCG GATGTCGATT 1800 
ACGACACGCT GTTCACCQAA GGCCAGACCC ' TTTATGG CAG CGCAT6TGCC 1850 
GCGTGCCATG GTGCCGCTGG CCAAGGTGCG CAGGGCCCOA CCTTTGTGGG IdOO 
CGTGCCGGAT GTGACGGGTG ACAAGGACTA CCMGCCCGC ACCATCATCC 1350 
ACGGTTTTGG CTATATGCCG TCGTTTGCGA CTCGGCTGGA TGACGAGGAG 2000 
GTT GCCGCCA TCGCGACCTT TATCCGCAAC AGCTGGGGCA ATGACGAAGG 2050 
CATCCTGACC CCGGCCGAGG CCGCTGCCAC CCGCTGAATG CTGTAAAAAC 2100 
CACCCTCGCC TGCACATCAG GCGGGGGTAT TTCATTTATO TTCACATCTG 2150 
CCTTTGACAT GTGCCGCTAT CACGGTTAAT GCGGCCCTTC GGCTGTTCTG 2200 
GGTCTAAGCG GGTGTGTTGC CCGATAAGAG AGACGGTTCA 6TCCCTCCCG 2250 
CCCTATTTAG GGCCGATTTA GGCAGAATAG TTTTGACTCA TCAAAATATC 2300 
GCCGCGCCTC TGGCCGCGGC CCTTTCGCAA CGTGGATATG AAACGCTGAC 2350 
CGCCGTGCAG CAAGCTGTGC TCGCGCCCGA GGCTGATGGC CGCGACCTGC 2400 
TGGTGTCGGC ACAGACCGGT TCGGGTAAGA CGGTGGCCTT TGGTATCGCA 2450 
GTCGCGCCCG ACCTTTTGG6 CGACGACAAT ATCCTGCCGC TGAACACGCC 2500 
GCCTGTTGCG CTGTTCATCG CCCCCA.CGCG CGAGCMGCG CTGCAAGTTG 25S0 
CTCAGGAACT GACCTGGCTT TACGCCAATG CAGGTGCCCA GATCGCGACC 2600 
TGCGTCGGCG GTATGGATTA CCGCACCGAG CGCCGCGCCC TTGCACGTCT 2650 
GCCGCAAATC GtTGTCGGCA CGCCCGGCCG TCTGCGCGAC CATATCQACC 2700 
GTGGCGGCCT TGACCTGTGC GAAfTGCGCG TGACCGTGCT GGACGAAGCG 2750 
GATGAGATGC TCGACCTCGG CTCCCGCGAT GATCTGCAAT ATATCTTGCA 2800 
AGCCGCGCCC GAAGATCGCC GCACGCTGAT GTTCTCGGCC ACCOTGCCGC 28S0 
GCGAGATTGA AAAACTGGCC CGCGACTTCC AAAATGACGC CCTGCGTCTG 2900 
GAAACCCGTG GCGAGGCCAA GCAGCACAAC GACATCAGCT ACCAAGCTTT 2SS0 
GXCGGTCACC ATGCGCC3ATC GCGAAAACGC CATTTTCAAC ATGCTGCGTT 3000 
TTTATGAATC GCGCACG6CG ATCATCTTCT GCAAGACCCG CGCCAATGTG 3050 
AATGATCTGC TGTCGCGGAT GAGCGGTCGT GGCTTCCGCG TGGTGGCCCT 3100 
GTCGGGCGAG CTGTCGCAAC AGGAACGCAC CAACGCGCTG CAAGCGCTGC 3150 
STGATGGCCG CGCCAACGTT TCTATCGCGA CCGACGTCGC GGCGCGCGGC 3200 
ATTGACTTGC CGGGCCTCGA GCTGGTGATC CACTACGATC TGCCGACCAA 3250 
TGCCGAAACC CTGCTGCACC GCTCGGGCCG TACCGGCCGC CGGGTGCCAA 3300 
GGGCSTCTCG GC6CTGATCG TCACCCCCGG CGATTTCAAA AAAGCGCAGC 3350
GTTTGCTGAG CTTTGCCAAA GTGACCGCGG AATGGGGCAA GGCGCCTTCG 3400
GCCGAAGA 3408

<210> 2
<211> 609
<212> PRT
<213> Gluconobacter oxydans
<220>
<221> SIGNAL
<222> (1)..(31)
<220>
<221> CHAIN
<222> (32)..(609)

<400> 2

Met Leu Pro Lys Ser Leu Lys His Lys Asn Gly Ala Met Arg Leu 15 
. Val Ala Ala Ser Thr Leu Ala Leu Met lie Gly Ala Gly. Ala His 30 

20 

Ala Gin Val Asrn Pro val Glu Val Pro Val Gly Ala Asn Glu Thr 45 
Phe Thr Ser Arg Val Leu Thr Thr Gly Leu Ser Asn Pro Txp Glu 60 
25 He Thr Trp Gly Pro Asp Asn Met Leu Trp Val Thr Glu Arg Ser 75 
Ser Gly Glu Val Thr A*g Val Asp Pro Asn Thr Gly Glu Gin Gin 90 
Val Leu Leu Thr Leu Thr Asp Phe Ser Val Asp Val Gin His Gin 105 

30 

Gly Leu Leu Gly Leu Ala Leu His Pro Glu Phe Met Gin Glu. Ser 120 
Gly Aen Asp Tyr Val Tyr He Val Tyr Thr Tyr Asn Thr Gly Thr 135 
3S Glu Glu Ala Pro Asp Pro His Gin Lys Leu Val Arg Tyr Ala Tyr 150 
Asp Ala Ala Ala Gin Gin Leu Val Asp Pro Val Asp Leu Val Ala 16 S 
Gly Zle Pro Ala Gly Asn Asp His Asn Gly Gly Arg He Lys Phe 180 

40 

Ala Pro Asp Gly Gin His He Phe Tyr Thr Leu Gly Glu Gin Gly 195 
Ala Asn Phe Gly Gly Asn Phe Arg Arg Pro Asn His Ala Gin Leu 210
Leu Pro Thr Gin Glu Gin Val Asp Ala (Sly Asp Trp Val Ala Tyr 225
Ser Gly Lys He Leu Arg Val Asn Leu Asp Gly Thr He Pro Glu 240 
5 Asp Asn Pro Glu He Gin Gly Val Arg Ser His He Phe Thr Tyr 255 
Gly Hie Arg Asn Pro Gin Gly He Thr Phe Gly Pro Asp Gly Thr 270 
He Tyr Ala Thr Glu His Gly Pro Asp Thr Asp Asp Glu Leu Asn 285 

10 

. He lie Ala Gly Gly Gly Asn Tyr Gly Trp Pro Asn Val Ala Gly 300 

Tyr Arg Asp Gly Lys Ser Tyr Val Tyr Ala Asp Trp Ser Gin Ala 315 

15 Pro Ala Asp Gin Arg Tyr Thr Gly Arg Ala Gly He Pro Asp Thr 330 

Val Pro Gin Phe Pro Glu Leu Glu Phe Ala Pro Glu Met Val Asp 345 

* ■ * • ., - 

Pro Leu Thr Thr, Tyr Trp .Thr Val Asp Asn Asp Tyr Asp Phe Thr 360 

20 

Ala Asn Cys Gly Trp He Cys Asn Pro Thr He Ala Pro Ser Ser 375 
Ala Tyr Tyr Tyr Ala Ala Gly Glu Sar Gly He Ala Ala Trp Asp 350 
25 Asn Ser He Leu He Pro Thr Leu Lys His Gly Gly He Tyr Val 405 
Gin His Leu Ser . Asp Asp Gly Gin . Ser Val Asp Gly Leu Pro Glu 420 
Leu Trp Phe Ser Thr Gin Asn Arg Tyr Arg Asp He Glu He Ser 435 

30 

Pro Asp Asn His Val Pne val Ala Thr Asp Asn Phe Gly Thr Ser 450 
Ala Gin Lys Tyr Gly Glu Thr Gly Phe Thr Asn Val Leu His Asn 465 
35 Pro Gly Ala He Leu Val Phe Ser Tyr Val Gly Glu Asp Ala Ala 480 
Gly Gin Thr Gly Met Met Thr Ala Pro Ala Pro Gin Thr Gin Tyr 455 
Thr Gin Val Pro Ala Glu Gly Ala Gly Ala Gly Ala Thr Glu Val 510 

40 

Ala Asp Val Asp Tyr Asp Thr Leu Phe Thr Glu Gly Gin Thr Leu 525 
Tyr Gly Ser Ala Cys Ala Ala Cys His Gly Ala Ala Gly Gin Gly 540
Ala Gin Gly Pro Thr She Val Gly Val Pro Asp Val Thr Gly Asp- 555
Lye Asp Tyr Leu Ala Arg Tlir lie lie His Gly Phe.Gly Tyr Mec 570 
5 Fro Ser Phe Ala Thr Arg Leu Asp Asp Glu Glu Val Ala Ala He 585 
Ala Thr Phe He Arg Asn Ser Trp Gly Aan Asp Glu Gly He Leu 600 
Thr Pro Ala Glu Ala Ala Ala Thr Arg 609

<210> 3
<211> 14
<212> PRT
<213> Gluconobacter oxydans

<400> 3
Gln [Xaa/Gly] Asn [Pro/Lys] Val Glu Val Pro Val Gly Ala Asn Glu Thr 14

<210> 4
<211> 31
<212> PRT
<213> Gluconobacter oxydans
<220>
<221> SIGNAL
<222> (1)..(31)

<400> 4
Met Leu Pro Lys Ser Leu Lys His Lys Asn Gly Ala Met Arg Leu 15
Val Ala Ala Ser Thr Leu Ala Leu Met Ile Gly Ala Gly Ala His 30
Ala 31

<210> 5
<211> 578
<212> PRT
<213> Gluconobacter oxydans
<220>
<221> CHAIN
<222> (1)..(578)
<400> 5

Gin Val Asn Pro Val Glu Val Pro Val Gly Ala Asn Glu Thy Ph e 15 



Thr Ser Arg Val Leu Thr Thr Gly Leu Ser Asn Pro Trp Glu He 30 

Thr Trp Gly Pro Asp Asn Men Leu Trp Val Thr Glu Arg Ser Ser 45 

10 Gly Glu Val Thr Arg Val Asp Pro Asn Thr Gly Glu Gin Gin Val 60 

Leu Leu Thr Leu Thr Asp Phe Ser Val Asp Val Gin His Gin Gly 75 

Leu Leu Gly Leu Ala Leu His Pro Glu Phe Met Gin Glu Ser Gly 90 



Asn Asp Tyr Val Tyr He Val Tyr Thr Tyr Asn Thr Gly Thr Glu 105 
Glu Ala Pro Asp Pro His Gin Lys Leu Val Arg Tyr Ala Tyr Asp 120 
20 Ala Ala Ala Gin Gin Leu Val Asp Pro Val Asp Leu Val Al* Gly 135 
He Pro Ala Gly Asn Asp His Asn Gly Gly Arg He Lys Phe Ala 150 
Pro Asp Gly Gin His He Phe Tyr Thr Leu Gly Glu Gin Gly Ala 165 
Asn Phe Gly Gly Asn Phe Arg Arg Pro Asn His Ala Gin Leu Leu 180 
Pro Thr Gin Glu Gin Val Asp Ala Gly Asp Trp Val Ala Tyr Ser 195 
30 Gly Lys He Leu Arg Val Asn Leu Asp Gly Thr He Pro Glu Asp 210 
Asn Pro Glu lie Glu, Gly Val Arg Ser His Xle Phe Thr Tyr- Gly 255 
His Arg Asn Pro Gin Gly lie Thr Phe Gly Pro Asp Gly Thr. He 240 
Tyr Ala Thr Glu His Gly Pro Asp Thr Asp Asp Glu Leu Asn He 255 
He Ala Gly Gly Gly Asn Tyr Gly Trp Pro Asn Val Ala Gly Tyr 270 
40 Arg Asp Gly Lys Ser Tyr. Val Tyr Ala Asp Trp Ser Gin Ala Pro 285 
Ala Asp Gin Arg Tyr Thr, Gly Arg Ala Gly He Pro Asp Thr Val 300 
Pro Gin phe Pro Glu Leu Glu Phe Ala Pro Glu Mec Val Asp Pro 315
Leu Thr Thr Tyr Trp Thr Val Asp Asn Asp Tyr Asp Phe Thr Ala 330
Asn Cys Gly Trp lie Cys Asn Pro Thar lie Ala Pro Ser Ser .Ala 345 •■ 

S 

Tyr Tyr Tyr Ala Ala Gly C3lu Ser Gly He Ala Ala Trp Asp Asn .360 
Sex He Leu He Pro Thr Leu Lys His Gly Gly He Tyr Val Gin 375 
10 His Leu Ser Asp Asp Gly Gin Ser Val Asp Gly Leu Pro Glu Leu 390 
Trp Phe Ser Thr Gin Asn Arg Tyr Arg Asp He. Glu He Ser .Pro 4 OS 
Asp Asn His Val Phe Val Ala Thr Asp Asn Phe Gly Thr Ser Ala 420 

15 

Gin Lys Tyr Gly Glu Thr Gly Phe Thr Asn Val Leu His Asn Pro 43 S 
Gly Ala He Leu Val Phe Ser Tyr Val Gly Glu Asp Ala Ala .Gly 450 
20 Gin Thr Gly Met Met Thr Ala Pro Ala Pro Gin Thr Gin Tyr Thr 465 
Gin Val Pro Ala Glu Gly Ala Gly Ala Gly Ala Thr Glu Val Ala 480 
Asp Val Asp Tyr Asp Thr Leu Phe Thr Glu Gly Gin Thr Leu Tyr 495 

25 

Gly Ser Ala Cys Ala Ala Cys His Gly Ala Ala Gly Gin Gly Ala 510 
Gin Gly Pro Thr Phe Val Gly Val Pro Asp Val Thr Gly Asp Lys 525 
30 Asp Tyr Leu Ala Arg Thr He He His Gly Phe Gly Tyr Met Pro 540 
Ser Phe Ala Thr Arg Leu Asp Asp Glu Glu val Ala Ala He Ala 555 
Thr Phe He Arg Asn Sex Tirp Gly Asn Asp Glu Gly lie Leu Thr 570 

35 

Pro Ala Glu Ala Ala Ala Thr Arg 578

<210> 6
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> an artificially synthesized primer sequence
<400> 6
carggyaacc csgtbga

<210> 7
<211> 17
<212> DNA
<213> Artificial Sequence
<223> an artificially synthesized primer sequence
<220>
<221> misc_feature
<222> $
<223> n is a or g or c or t
<400> 7
gi:ytccrttog crccvac

<210> 8
<211> 15
<212> DNA
<213> Artificial Sequence
<220>
<223> an artificially synthesized primer sequence
<400> 8
caggcftaacc cggtc

<210> 9
<211> 15
<212> DNA
<213> Artificial Sequence
<220>
<223> an artificially synthesized primer sequence
<400> 9
gactcgtttg cg/ccc